320 research outputs found

Approximating the double-cut-and-join distance between unsigned genomes

Author: A Bergeron
A Caprara
CM Papadimitriou
G Lin
H Jiang
JD Kececioglu
Jiadong Yu
MM Halldórsson
R Sun
Ruimin Sun
S Hannenhalli
S Yancopoulos
V Bafna
X Chen
Xin Chen
Publication venue: BioMed Central
Publication date: 01/01/2011
Field of study

In this paper we study the problem of sorting unsigned genomes by double-cut-and-join operations, where genomes allow a mix of linear and circular chromosomes to be present. First, we formulate an equivalent optimization problem, called maximum cycle/path decomposition, which is aimed at finding a largest collection of edge-disjoint cycles/AA-paths/AB-paths in a breakpoint graph. Then, we show that the problem of finding a largest collection of edge-disjoint cycles/AA-paths/AB-paths of length no more than l can be reduced to the well-known degree-bounded k-set packing problem with k = 2l. Finally, a polynomial-time approximation algorithm for the problem of sorting unsigned genomes by double-cut-and-join operations is devised, which achieves the approximation ratio for any positive ε. For the restricted variation where each genome contains only one linear chromosome, the approximation ratio can be further improved t

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

DR-NTU (Digital Repository of NTU)

The Fibers and Range of Reduction Graphs in Ciliates

Author: A. Bergeron
A. Ehrenfeucht
Hendrik Jan Hoogeboom
J. Setubal
P. Pevzner
R. Brijder
Robert Brijder
S. Hannenhalli
Publication venue: Springer Science and Business Media LLC
Publication date: 07/02/2007

The biological process of gene assembly has been modeled based on three types of string rewriting rules, called string pointer rules, defined on so-called legal strings. It has been shown that reduction graphs for legal strings, which are based on the notion of the breakpoint graph in the theory of sorting by reversal, provide valuable insights into the gene assembly process. We characterize which legal strings obtain the same reduction graph (up to isomorphism), and moreover we characterize which graphs are (isomorphic to) reduction graphs. Comment: 24 pages, 13 figures

arXiv.org e-Print Archive

Crossref

CTCF binding site classes exhibit distinct evolutionary, genomic, epigenomic and transcriptomic features

Author: Apreleva Sofia
Bartolomei Marisa S
Essien Kobby
Hannenhalli Sridhar
Singh Larry N
Vigneau Sebastien
Publication venue: BioMed Central
Publication date: 01/01/2009

CTCF DNA binding sites are classified into distinct functional classes, with distinct biological properties, shedding light on the differing functional roles of CTCF binding

Crossref

PubMed Central

An asymmetric approach to preserve common intervals while sorting by reversals

Author: A Bergeron
A Caprara
A Siepel
BME Moret
Christian Gautier
DA Bader
E Tannier
G Blanc
G Tesler
M Bader
M Bernt
Marie-France Sagot
Marília DV Braga
MDV Braga
S Berard
S Hannenhalli
S Heber
Y Diekmann
Y Han
Publication venue: BioMed Central
Publication date: 01/01/2009

Dias Vieira Braga M, Gautier C, Sagot M-F. An asymmetric approach to preserve common intervals while sorting by reversals. Algorithms for Molecular Biology. 2009;4(1):16. Background: The reversal distance and optimal sequences of reversals to transform one genome into another are useful tools to analyse evolutionary scenarios. However, the number of sequences is huge and some additional criteria should be used to obtain a more accurate analysis. One strategy is searching for sequences that respect constraints, such as the common intervals (clusters of co-localised genes). Another approach is to explore the whole space of sorting sequences, possibly grouping them into equivalence classes. Recently both strategies have started to be combined, to restrict the space to the sequences that respect constraints. In particular, an algorithm has been proposed to list classes whose sorting sequences do not break the common intervals detected between the two initial genomes A and B. This approach may reduce the space of sequences and is symmetric (the result of the analysis sorting A into B can be obtained from the analysis sorting B into A). Results: We propose an alternative approach to restrict the space of sorting sequences, using progressive instead of initial detection of common intervals (the list of common intervals is updated after applying each reversal). This may reduce the space of sequences even more, but is shown to be asymmetric. Conclusions: We suggest that our method may be more realistic when the ancestor-descendant relation between the analysed genomes is clear, and we apply it to obtain a better characterisation of the evolutionary scenario of the bacterium Rickettsia felis with respect to one of its ancestors.
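
The notion of common intervals used in the abstract above, and the idea of recomputing them after each reversal, can be illustrated with a minimal sketch. It assumes genomes are given as signed permutations of the same duplicate-free gene set (lists of non-zero integers); the function names `common_intervals` and `apply_reversal` are hypothetical and chosen for this illustration. It is a naive quadratic enumeration, not the algorithm of the paper.

```python
def common_intervals(genome_a, genome_b):
    """Naively list the common intervals (size >= 2) of two signed permutations.

    A common interval is a set of genes that is contiguous in both genomes,
    ignoring order and sign. Assumes both genomes hold the same genes, each once.
    """
    genes_a = sorted(abs(g) for g in genome_a)
    genes_b = sorted(abs(g) for g in genome_b)
    if genes_a != genes_b or len(set(genes_a)) != len(genes_a):
        raise ValueError("genomes must contain the same genes, without duplicates")

    # Rewrite genome A as the positions its genes occupy in genome B.
    pos_in_b = {abs(g): i for i, g in enumerate(genome_b)}
    relabeled = [pos_in_b[abs(g)] for g in genome_a]

    found = []
    for start in range(len(relabeled)):
        lo = hi = relabeled[start]
        for end in range(start + 1, len(relabeled)):
            lo = min(lo, relabeled[end])
            hi = max(hi, relabeled[end])
            # The segment A[start..end] is contiguous in B exactly when its
            # positions in B span as many slots as the segment has genes.
            if hi - lo == end - start:
                found.append(frozenset(abs(g) for g in genome_a[start:end + 1]))
    return found


def apply_reversal(genome, i, j):
    """Return a copy of the genome with segment i..j (inclusive) reversed and sign-flipped."""
    flipped = [-g for g in reversed(genome[i:j + 1])]
    return genome[:i] + flipped + genome[j + 1:]


a = [1, -3, 2, 4]
b = [1, 2, 3, 4]
before = common_intervals(a, b)                       # detected on the initial genomes
after = common_intervals(apply_reversal(a, 1, 2), b)  # re-detected after one reversal
```

In this toy case the list of common intervals grows after the reversal, which is the kind of update that a progressive detection would take into account at each step.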

Crossref

Directory of Open Access Journals

INRIA a CCSD electronic archive server

PubMed Central

Publications at Bielefeld University

Hal-Diderot

Bayesian Integration of Genetics and Epigenetics Detects Causal Regulatory SNPs Underlying Expression Variability

Author: Cappola Thomas P
Consortium MAGNet
Das Avinash
Hakonarson Hakon
Hannenhalli Sridhar
Jensen Shane T
Margulies Kenneth B
Moravec Christine S
Morley Michael
Tang W.H.W.
Publication venue: ScholarlyCommons
Publication date: 12/10/2015

Standard expression quantitative trait loci (eQTL) analysis detects polymorphisms associated with gene expression without revealing causality. We introduce a coupled Bayesian regression approach, eQTeL, which leverages epigenetic data to estimate regulatory and gene interaction potential, and identifies combinations of regulatory single-nucleotide polymorphisms (SNPs) that explain the gene expression variance. On human heart data, eQTeL not only explains a significantly greater proportion of expression variance but also predicts gene expression more accurately than other methods. Based on realistic simulated data, we demonstrate that eQTeL accurately detects causal regulatory SNPs, including those with small effect sizes. Using various functional data, we show that SNPs detected by eQTeL are enriched for allele-specific protein binding and histone modifications, which potentially disrupt binding of core cardiac transcription factors and are spatially proximal to their target. eQTeL SNPs capture a substantial proportion of genetic determinants of expression variance and we estimate that 58% of these SNPs are putatively causal.

PubMed Central

ScholarlyCommons@Penn

A Unifying Model of Genome Evolution Under Parsimony

Author: A Bergeron
A Caprara
AE Darling
AW Xu
B Paten
B Raphael
Benedict Paten
C Chauve
D Bienstock
Daniel R Zerbino
David Haussler
E Tannier
G Bourque
Glenn Hickey
I Elias
J Edmonds
J Felsenstein
J Kim
J Ma
L Chindelevitch
LL Wang
M Alekseyev
M Bader
M Blanchette
M Shao
MD Braga
N El-Mabrouk
O Westesson
P Medvedev
S Hannenhalli
S Yancopoulos
W Day
W Miller
YS Song
Publication date: 12/05/2014

We present a data structure called a history graph that offers a practical basis for the analysis of genome evolution. It conceptually simplifies the study of parsimonious evolutionary histories by representing both substitutions and double cut and join (DCJ) rearrangements in the presence of duplications. The problem of constructing parsimonious history graphs thus subsumes related maximum parsimony problems in the fields of phylogenetic reconstruction and genome rearrangement. We show that tractable functions can be used to define upper and lower bounds on the minimum number of substitutions and DCJ rearrangements needed to explain any history graph. These bounds become tight for a special type of unambiguous history graph called an ancestral variation graph (AVG), which constrains in its combinatorial structure the number of operations required. We finally demonstrate that for a given history graph G, a finite set of AVGs describes all parsimonious interpretations of G, and this set can be explored with a few sampling moves. Comment: 52 pages, 24 figures

arXiv.org e-Print Archive

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Chromosomal Breakpoint Reuse in Genome Sequence Rearrangement

Author: David Sankoff
Hannenhalli S.
Pevzner P.A.
Phil Trinh
Sankoff D.
Publication venue: Mary Ann Liebert Inc

Crossref

Genomic distance under gene substitutions

Author: A Bergeron
BT Lahn
C Moritz
Jens Stoye
JL Boore
Leonardo C Ribeiro
Marília D V Braga
MDV Braga
MT Ross
N El-Mabrouk
Raphael Machado
S Hannenhalli
S Ohno
S Yancopoulos
Publication venue: BioMed Central
Publication date: 01/01/2011

Dias Vieira Braga M, Machado R, Ribeiro LC, Stoye J. Genomic distance under gene substitutions. BMC Bioinformatics. 2011;12(Suppl 9: Proc. of RECOMB-CG 2011):S8. Background: The distance between two genomes is often computed by comparing only the common markers between them. Some approaches are also able to deal with non-common markers, allowing the insertion or the deletion of such markers. In these models, a deletion and a subsequent insertion that occur at the same position of the genome count for two sorting steps. Results: Here we propose a new model that sorts non-common markers with substitutions, which are more powerful operations that encompass insertions and deletions. A deletion and an insertion that occur at the same position of the genome can be modeled as a substitution, counting for a single sorting step. Conclusions: Comparing genomes with unequal content, but without duplicated markers, we give a linear time algorithm to compute the genomic distance considering substitutions and double-cut-and-join (DCJ) operations. This model provides a parsimonious genomic distance to handle genomes free of duplicated markers, which is in practice a lower bound on the real genomic distances. The method could also be used to refine orthology assignments, since in some cases a substitution could actually correspond to an unannotated orthology.
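
As background for the DCJ operations mentioned in the abstract above, the following is a minimal sketch of the classical DCJ distance in the simplest setting only: signed genomes with identical, duplicate-free gene content and circular chromosomes, where the distance equals the number of genes minus the number of cycles in the adjacency graph. It does not implement the substitution model of the paper, which handles unequal content. The function names and the list-of-lists genome encoding are assumptions made for this illustration.

```python
def _adjacencies(genome):
    """Map each gene extremity to its neighbour, for circular chromosomes.

    A genome is a list of chromosomes; a chromosome is a list of signed genes.
    Gene +g is read tail-to-head, gene -g head-to-tail.
    """
    partner = {}
    for chrom in genome:
        n = len(chrom)
        for i in range(n):
            g, h = chrom[i], chrom[(i + 1) % n]
            right_end = (abs(g), "h") if g > 0 else (abs(g), "t")
            left_end = (abs(h), "t") if h > 0 else (abs(h), "h")
            partner[right_end] = left_end
            partner[left_end] = right_end
    return partner


def dcj_distance_circular(genome_a, genome_b):
    """DCJ distance N - C for circular genomes over the same N genes (C = cycles)."""
    genes_a = sorted(abs(g) for chrom in genome_a for g in chrom)
    genes_b = sorted(abs(g) for chrom in genome_b for g in chrom)
    if genes_a != genes_b or len(set(genes_a)) != len(genes_a):
        raise ValueError("genomes must contain the same genes, without duplicates")

    adj_a = _adjacencies(genome_a)
    adj_b = _adjacencies(genome_b)

    seen = set()
    cycles = 0
    for start in adj_a:
        if start in seen:
            continue
        cycles += 1
        x = start
        # Alternate between an adjacency of A and an adjacency of B
        # until the walk returns to its starting extremity.
        while x not in seen:
            seen.add(x)
            y = adj_a[x]
            seen.add(y)
            x = adj_b[y]
    return len(genes_a) - cycles
```

For example, under these assumptions `dcj_distance_circular([[1, 2, 3]], [[1, -2, 3]])` evaluates to 1, since a single inversion of gene 2 transforms one genome into the other.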

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Publications at Bielefeld University

Genome rearrangements with duplications

Author: B Hiller
C Zheng
D Bertrand
D Bryant
D Sankoff
Drosophila 12 Genomes Consortium
F Cabanillas
F Mitelman
G Blanc
G Blin
J Salse
K Swenson
M Bader
M Marron
M Ozery-Flato
Martin Bader
N El-Mabrouk
S Gog
S Hannenhalli
S Yancopoulos
T Hartman
V Bafna
X Chen
Publication venue: BioMed Central
Publication date: 01/01/2010

Crossref

Springer - Publisher Connector

PubMed Central

Identification and Functional Characterization of Gene Components of Type VI Secretion System in Bacterial Genomes

Author: A Pautsch
C von Mering
H Schmidt
I Rajan
JD Mougous
K Ehrbar
K Tamura
M Kostakioti
M Pellegrini
MA Schell
O Emanuelsson
S Pukatzki
S Yellaboina
Sakshi Shrivastava
Sharmila S. Mande
Sridhar Hannenhalli
T Zusman
Y Wang
Publication venue: Public Library of Science
Publication date: 13/08/2008

A new secretion system, called the Type VI Secretion System (T6SS), was recently reported in Vibrio cholerae, Pseudomonas aeruginosa and Burkholderia mallei. A total of 18 genes have been identified as belonging to this secretion system in V. cholerae. Here we attempt to identify the presence of T6SS in other bacterial genomes. This includes identification of orthologous sequences, conserved motifs, domains, families, 3D folds, genomic islands containing T6SS components, phylogenetic profiles and protein-protein association of these components. Our analysis indicates the presence of T6SS in 42 bacteria and its absence in most of their non-pathogenic species, suggesting a role of T6SS in imparting pathogenicity to an organism. Analysis of genomic regions containing T6SS components, phylogenetic profiles and protein-protein association of T6SS components indicates a few additional genes which could be involved in this secretion system. Based on our studies, functional annotations were assigned to most of the components. With the exception of one gene, we could group all the genes of T6SS into those belonging to the puncturing device, and those located in the outer membrane, transmembrane and inner membrane. Based on our analysis, we have proposed a model of T6SS and have compared it with the other bacterial secretion systems.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central